Local Interpretable Model-agnostic Explanations for models trained to predict players' wages in the FIFA 23 game

We analyse the random forest model trained in previous assignments on the FIFA 23 game dataset. The model aims to predict players' wages given their statistics in the game.

We use Local Interpretable Model-agnostic Explanations (LIME) to analyse feature importance.

We analyse the stability of the method.

We compare the LIME method with the SHAP method implemented during the previous assignment.

We also compare LIME explanations for the random forest and the linear model.
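As a reminder of what LIME does, the following is a minimal sketch of the idea, not the implementation used by the lime or dalex libraries: perturb the observation, query the model on the perturbed points, weight them by proximity, and fit a weighted linear surrogate. The names `lime_sketch` and `toy_model`, the Gaussian perturbations, and the kernel width are assumptions made for illustration only.

```python
import numpy as np

def lime_sketch(predict_fn, x, feature_scale, n_samples=2000, seed=0):
    """Fit a proximity-weighted linear surrogate around x (illustrative only)."""
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    # 1. Perturb the observation with Gaussian noise scaled per feature.
    Z = x + rng.normal(size=(n_samples, d)) * feature_scale
    # 2. Query the black-box model on the perturbed points.
    y = predict_fn(Z)
    # 3. Weight each sample by its proximity to x (exponential kernel).
    dist = np.linalg.norm((Z - x) / feature_scale, axis=1)
    kernel_width = 0.75 * np.sqrt(d)  # assumed width, chosen for this sketch
    w = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Weighted least squares with an intercept column.
    A = np.hstack([np.ones((n_samples, 1)), (Z - x) / feature_scale])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[1:]  # local effect of each feature, per unit of feature_scale

def toy_model(Z):
    # Hypothetical black box: depends on features 0 and 1, ignores feature 2.
    return 3.0 * Z[:, 0] - 2.0 * Z[:, 1]

x = np.array([1.0, 2.0, 3.0])
effects = lime_sketch(toy_model, x, feature_scale=np.ones(3))
ranking = np.argsort(-np.abs(effects))  # feature indices, most important first
print(effects, ranking)
```

Because the toy model is itself linear, the surrogate recovers its coefficients. For a random forest the surrogate is only a local approximation, and its coefficients depend on the random perturbations, which is why stability across seeds is worth checking.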

Analysis of the stability of the LIME method

We compare LIME explanations for three observations, using five different random seeds for each.

We observe that the method is fairly stable: the most important features are usually the same across seeds (note that the top two features agree across seeds for all three observations), while the less important features show some differences.
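The visual comparison could be complemented by a number. The sketch below computes the mean pairwise Jaccard similarity of the top-k feature sets across seeds. The rankings in it are hypothetical placeholders, not values read from the plots, and the helper names are made up for this example.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two collections treated as sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 1.0

def mean_pairwise_jaccard(rankings, k):
    """Average Jaccard similarity of the top-k features over all pairs of seeds."""
    top_sets = [r[:k] for r in rankings]
    pairs = list(combinations(top_sets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Hypothetical feature rankings for one observation under three seeds.
rankings_by_seed = [
    ["Overall", "Value(in Euro)", "Age", "Potential"],
    ["Overall", "Value(in Euro)", "Potential", "Age"],
    ["Overall", "Value(in Euro)", "Age", "BaseStats"],
]

top2 = mean_pairwise_jaccard(rankings_by_seed, k=2)
top3 = mean_pairwise_jaccard(rankings_by_seed, k=3)
print(top2, top3)
```

A value of 1.0 means the top-k sets are identical for every pair of seeds; lower values indicate that the seeds disagree about which features belong in the top k.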

Comparison of the first chosen observation

The first seed: Point3_seed0_im1.png

The second seed: Point3_seed1_im1.png

The third seed: Point3_seed2_im1.png

The fourth seed: Point3_seed3_im1.png

The fifth seed: Point3_seed4_im1.png

Comparison of the second chosen observation

The first seed: Point3_seed0_im2.png

The second seed: Point3_seed1_im2.png

The third seed: Point3_seed2_im2.png

The fourth seed: Point3_seed3_im2.png

The fifth seed: Point3_seed4_im2.png

Comparison of the third chosen observation

The first seed: Point3_seed0_im3.png

The second seed: Point3_seed1_im3.png

The third seed: Point3_seed2_im3.png

The fourth seed: Point3_seed3_im3.png

The fifth seed: Point3_seed4_im3.png

Comparison between LIME explanations and SHAP explanations

We observe that LIME and SHAP explanations may differ even for the most important features (see the comparison of the second chosen observation). Moreover, the top three features are consistent for only one of the five observations (the fourth chosen observation).

However, comparing the sets of the top three features, we observe that at least two elements overlap for all five observations. This suggests that both methods are useful for separating important features from unimportant ones.
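The two checks used here (same top three in the same order, and the number of shared top-three features) can be sketched as follows. The attribution values are hypothetical placeholders rather than outputs of the models. Since LIME and SHAP attributions are not on the same scale, only the rankings by absolute value are compared.

```python
def top_k(attributions, k):
    """Names of the k features with the largest absolute attribution."""
    ordered = sorted(attributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return [name for name, _ in ordered[:k]]

def compare_top_k(lime_attr, shap_attr, k=3):
    """Compare the top-k features of two explanations of the same observation."""
    lime_top, shap_top = top_k(lime_attr, k), top_k(shap_attr, k)
    return {
        "same_order": lime_top == shap_top,
        "shared": len(set(lime_top) & set(shap_top)),
    }

# Hypothetical attributions for a single observation.
lime_attr = {"Overall": 5200.0, "Value(in Euro)": 4100.0, "Age": -300.0, "Potential": 650.0}
shap_attr = {"Overall": 3900.0, "Value(in Euro)": 6100.0, "Age": 800.0, "Potential": -200.0}

result = compare_top_k(lime_attr, shap_attr)
print(result)
```

In this made-up case the two explanations share two of their top three features but rank them differently, which mirrors the pattern described above.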

Comparison of the first chosen observation

LIME explanations: Point2_dalex_forest_im1.png

SHAP explanations: Point4_shap_im1.png

Comparison of the second chosen observation

LIME explanations: Point2_dalex_forest_im2.png

SHAP explanations: Point4_shap_im2.png

Comparison of the third chosen observation

LIME explanations: Point2_dalex_forest_im3.png

SHAP explanations: Point4_shap_im3.png

Comparison of the fourth chosen observation

LIME explanations: Point2_dalex_forest_im4.png

SHAP explanations: Point4_shap_im4.png

Comparison of the fifth chosen observation

LIME explanations: Point2_dalex_forest_im5.png

SHAP explanations: Point4_shap_im5.png

Comparison of LIME explanations between the tree model and the linear model

The most important features for the random forest model and the linear model usually differ. The random forest relies mostly on the Overall and Value(in Euro) features, while the linear model often relies on the Stats and Position rating features.

We observe that the most important features are usually consistent between the lime and dalex libraries. There are some differences in the less important features.

Comparison of the first chosen observation:

The random forest explanation with dalex library: Point2_dalex_forest_im1.png

The random forest explanation with lime library: Point2_lime_forest_im1.png

The linear model explanation with dalex library: Point2_dalex_linear_im1.png

The linear model explanation with lime library: Point2_lime_linear_im1.png

Comparison of the second chosen observation:

The random forest explanation with dalex library: Point2_dalex_forest_im2.png

The random forest explanation with lime library: Point2_lime_forest_im2.png

The linear model explanation with dalex library: Point2_dalex_linear_im2.png

The linear model explanation with lime library: Point2_lime_linear_im2.png

Comparison of the third chosen observation:

The random forest explanation with dalex library: Point2_dalex_forest_im3.png

The random forest explanation with lime library: Point2_lime_forest_im3.png

The linear model explanation with dalex library: Point2_dalex_linear_im3.png

The linear model explanation with lime library: Point2_lime_linear_im3.png

Comparison of the fourth chosen observation:

The random forest explanation with dalex library: Point2_dalex_forest_im4.png

The random forest explanation with lime library: Point2_lime_forest_im4.png

The linear model explanation with dalex library: Point2_dalex_linear_im4.png

The linear model explanation with lime library: Point2_lime_linear_im4.png

Comparison of the fifth chosen observation:

The random forest explanation with dalex library: Point2_dalex_forest_im5.png

The random forest explanation with lime library: Point2_lime_forest_im5.png

The linear model explanation with dalex library: Point2_dalex_linear_im5.png

The linear model explanation with lime library: Point2_lime_linear_im5.png

Appendix

Importing libraries and dataset

In [2]:
# 1. Import libraries

!pip3 install shap
!pip3 install dalex
!pip3 install lime
!pip3 install -q condacolab
import condacolab
condacolab.install()
!conda install -c conda-forge python-kaleido

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly
import kaleido

import pickle
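import lime
import lime.lime_tabular  # the submodule must be imported explicitly before lime.lime_tabular is used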
import shap
import dalex as dx

from math import isclose

from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
Installed versions (Python 3.7): shap 0.41.0, dalex 1.5.0, lime 0.2.0.1, scikit-learn 1.0.2, pandas 1.3.5, numpy 1.21.6, plotly 5.11.0, matplotlib 3.5.3
In [3]:
# 2. Load dataset and models from the previous homework

with open('X_train.pickle', 'rb') as handle:
    X_train_load = pickle.load(handle)

with open('y_train.pickle', 'rb') as handle:
    y_train_load = pickle.load(handle)

with open('X_test.pickle', 'rb') as handle:
    X_test_load = pickle.load(handle)

with open('y_test.pickle', 'rb') as handle:
    y_test_load = pickle.load(handle)

with open('tree_model.pickle', 'rb') as handle:
    forest_reg_load = pickle.load(handle)

with open('linear_model.pickle', 'rb') as handle:
    linear_model_load = pickle.load(handle)

print(X_train_load)
print(y_train_load)
print(X_test_load)
print(y_train_load)
print(forest_reg_load.predict(X_train_load))
print(linear_model_load.predict(X_train_load))
print(forest_reg_load.predict(X_test_load))
print(linear_model_load.predict(X_test_load))
       Overall  Potential  Value(in Euro)  Age  Height(in cm)  Weight(in kg)  \
1127        76         77         9000000   27            190             85   
6725        68         68         1200000   28            190             87   
4966        70         70         1100000   31            190             93   
1799        75         75         6000000   22            188             84   
4484        70         70         1300000   32            165             63   
...        ...        ...             ...  ...            ...            ...   
11284       64         64          525000   28            183             83   
11964       63         69          700000   23            176             70   
5390        69         69          375000   35            187             85   
860         77         77        10500000   29            176             73   
15795       59         59          220000   29            170             70   

       TotalStats  BaseStats  Release Clause  Weak Foot Rating  ...  \
1127         1808        392               0                 3  ...   
6725         1611        344         1600000                 2  ...   
4966         1594        338         1600000                 3  ...   
1799         1717        364        11400000                 3  ...   
4484         1773        365         2200000                 3  ...   
...           ...        ...             ...               ...  ...   
11284        1643        355          919000                 2  ...   
11964        1625        353               0                 3  ...   
5390         1672        359          750000                 3  ...   
860          2037        430        20000000                 3  ...   
15795        1564        335          363000                 4  ...   

       Best_Position_RM  Best_Position_RW  Best_Position_RWB  \
1127                  0                 0                  0   
6725                  0                 0                  0   
4966                  0                 0                  0   
1799                  0                 0                  0   
4484                  0                 0                  0   
...                 ...               ...                ...   
11284                 0                 0                  0   
11964                 0                 0                  0   
5390                  0                 0                  0   
860                   0                 0                  0   
15795                 0                 0                  0   

       Best_Position_ST  Attacking_Work_Rate_High  Attacking_Work_Rate_Low  \
1127                  1                         0                        0   
6725                  1                         0                        0   
4966                  0                         0                        1   
1799                  0                         0                        0   
4484                  0                         0                        0   
...                 ...                       ...                      ...   
11284                 0                         0                        1   
11964                 0                         0                        0   
5390                  0                         0                        1   
860                   0                         1                        0   
15795                 0                         0                        0   

       Attacking_Work_Rate_Medium  Defensive_Work_Rate_High  \
1127                            1                         0   
6725                            1                         0   
4966                            0                         1   
1799                            1                         0   
4484                            1                         0   
...                           ...                       ...   
11284                           0                         1   
11964                           1                         0   
5390                            0                         1   
860                             0                         0   
15795                           1                         0   

       Defensive_Work_Rate_Low  Defensive_Work_Rate_Medium  
1127                         0                           1  
6725                         1                           0  
4966                         0                           0  
1799                         0                           1  
4484                         1                           0  
...                        ...                         ...  
11284                        0                           0  
11964                        0                           1  
5390                         0                           0  
860                          0                           1  
15795                        0                           1  

[14831 rows x 111 columns]
1127     48000
6725      6000
4966      8000
1799     13000
4484     18000
         ...  
11284     2000
11964     2000
5390      6000
860      46000
15795     5000
Name: Wage(in Euro), Length: 14831, dtype: int64
       Overall  Potential  Value(in Euro)  Age  Height(in cm)  Weight(in kg)  \
10157       65         75         1500000   22            180             79   
3617        72         72         1900000   31            179             77   
4894        70         73         2100000   26            183             73   
2315        74         81         8000000   23            178             72   
2177        74         75         5000000   26            175             71   
...        ...        ...             ...  ...            ...            ...   
9049        66         66          725000   30            170             72   
14757       61         74          800000   21            184             78   
6779        68         76         2500000   23            182             73   
3269        72         72         2300000   30            185             82   
11602       64         64          500000   29            181             75   

       TotalStats  BaseStats  Release Clause  Weak Foot Rating  ...  \
10157        1561        344         3600000                 2  ...   
3617         1987        409         3600000                 4  ...   
4894         1975        410         4600000                 2  ...   
2315         1803        393        16800000                 3  ...   
2177         1854        385               0                 4  ...   
...           ...        ...             ...               ...  ...   
9049         1777        371         1600000                 3  ...   
14757        1485        333         1600000                 3  ...   
6779         1774        380         4099999                 3  ...   
3269         1825        387         5100000                 4  ...   
11602        1580        354          875000                 4  ...   

       Best_Position_RM  Best_Position_RW  Best_Position_RWB  \
10157                 0                 0                  0   
3617                  1                 0                  0   
4894                  0                 0                  0   
2315                  0                 0                  0   
2177                  0                 0                  0   
...                 ...               ...                ...   
9049                  0                 0                  0   
14757                 0                 0                  0   
6779                  0                 0                  0   
3269                  0                 0                  0   
11602                 0                 0                  0   

       Best_Position_ST  Attacking_Work_Rate_High  Attacking_Work_Rate_Low  \
10157                 0                         0                        0   
3617                  0                         0                        0   
4894                  0                         1                        0   
2315                  0                         0                        0   
2177                  1                         1                        0   
...                 ...                       ...                      ...   
9049                  0                         0                        1   
14757                 0                         0                        0   
6779                  0                         0                        0   
3269                  1                         0                        0   
11602                 0                         0                        0   

       Attacking_Work_Rate_Medium  Defensive_Work_Rate_High  \
10157                           1                         0   
3617                            1                         1   
4894                            0                         0   
2315                            1                         0   
2177                            0                         0   
...                           ...                       ...   
9049                            0                         0   
14757                           1                         0   
6779                            1                         0   
3269                            1                         0   
11602                           1                         0   

       Defensive_Work_Rate_Low  Defensive_Work_Rate_Medium  
10157                        0                           1  
3617                         0                           0  
4894                         0                           1  
2315                         0                           1  
2177                         0                           1  
...                        ...                         ...  
9049                         0                           1  
14757                        0                           1  
6779                         0                           1  
3269                         1                           0  
11602                        0                           1  

[3708 rows x 111 columns]
1127     48000
6725      6000
4966      8000
1799     13000
4484     18000
         ...  
11284     2000
11964     2000
5390      6000
860      46000
15795     5000
Name: Wage(in Euro), Length: 14831, dtype: int64
[45130.  5730.  8335. ...  5429. 41640.  4170.]
[36461.14113481  4366.01557268  9049.04479392 ...  3726.18901892
 33882.66952547  3284.73218782]
[ 1992.  12406.5 10253.5 ...  4265.  14385.5  2641. ]
[ 5452.90561019 11907.54997679 10061.40845932 ...  8209.27849936
 10756.15177953  5845.96896926]

Analysis of the model

In [11]:
# 1. Observe predictions of 5 observations (POINT 1)

observations = X_test_load.sample(5, random_state = 1)

predictions_forest = forest_reg_load.predict(observations)

predictions_linear = linear_model_load.predict(observations)

print(observations)
print(predictions_forest)
print(predictions_linear)
       Overall  Potential  Value(in Euro)  Age  Height(in cm)  Weight(in kg)  \
34          87         87        63000000   32            185             76   
8193        67         67          825000   31            183             85   
3106        72         72         2400000   28            174             69   
12140       63         69          700000   22            180             79   
11521       64         65          550000   28            192             86   

       TotalStats  BaseStats  Release Clause  Weak Foot Rating  ...  \
34           2140        443       104000000                 4  ...   
8193         1880        384         1600000                 3  ...   
3106         1917        390         3800000                 3  ...   
12140        1593        351         1400000                 2  ...   
11521        1515        327          784000                 2  ...   

       Best_Position_RM  Best_Position_RW  Best_Position_RWB  \
34                    0                 0                  0   
8193                  0                 0                  0   
3106                  0                 0                  0   
12140                 0                 0                  0   
11521                 0                 0                  0   

       Best_Position_ST  Attacking_Work_Rate_High  Attacking_Work_Rate_Low  \
34                    0                         1                        0   
8193                  0                         0                        0   
3106                  0                         0                        0   
12140                 0                         1                        0   
11521                 0                         0                        0   

       Attacking_Work_Rate_Medium  Defensive_Work_Rate_High  \
34                              0                         1   
8193                            1                         0   
3106                            1                         0   
12140                           0                         0   
11521                           1                         0   

       Defensive_Work_Rate_Low  Defensive_Work_Rate_Medium  
34                           0                           0  
8193                         0                           1  
3106                         0                           1  
12140                        0                           1  
11521                        0                           1  

[5 rows x 111 columns]
[174720.    7040.   12525.5   2482.5   1698.5]
[146171.12695566   9739.75500015   6601.38942177   5714.57255113
   2513.08255573]
In [25]:
# 2. Calculate lime decomposition for selected observations with lime library for the forest model (POINT 2, 5)

lime_explainer = lime.lime_tabular.LimeTabularExplainer(
    training_data=X_train_load.values,  
    feature_names=X_train_load.columns,
    mode="regression"
)

plot_id = 1
for i in range(len(observations)):
    lime_explanation_forest = lime_explainer.explain_instance(
        data_row=observations.iloc[i],
        predict_fn=lambda d: forest_reg_load.predict(d)
    )
    _ = lime_explanation_forest.as_pyplot_figure()
    plt.savefig('Point2_lime_forest_im' + str(plot_id) + '.png', dpi=300, bbox_inches='tight')
    plot_id += 1
X does not have valid feature names, but RandomForestRegressor was fitted with feature names
In [26]:
# 3. Calculate lime decomposition for selected observations with lime library for the linear model (POINT 2, 5)

plot_id = 1
for i in range(len(observations)):
    lime_explanation_linear = lime_explainer.explain_instance(
        data_row=observations.iloc[i],
        predict_fn=lambda d: linear_model_load.predict(d)
    )
    _ = lime_explanation_linear.as_pyplot_figure()
    plt.savefig('Point2_lime_linear_im' + str(plot_id) + '.png', dpi=300, bbox_inches='tight')
    plot_id += 1
X does not have valid feature names, but LinearRegression was fitted with feature names
In [20]:
# 4. Calculate lime decomposition for selected observations with dalex library for the forest model (POINT 2, 5)

explainer_forest_dx = dx.Explainer(forest_reg_load, X_train_load, y_train_load, verbose = False)
plot_id = 1
for i in range(len(observations)):
    explanation_forest_dx = explainer_forest_dx.predict_surrogate(observations.iloc[i])
    explanation_forest_dx.plot()
    plt.savefig('Point2_dalex_forest_im' + str(plot_id) + '.png', dpi=300, bbox_inches='tight')
    plot_id += 1
X does not have valid feature names, but RandomForestRegressor was fitted with feature names
In [27]:
# 5. Calculate lime decomposition for selected observations with dalex library for the linear model (POINT 2, 5)

explainer_linear_dx = dx.Explainer(linear_model_load, X_train_load, y_train_load, verbose = False)
plot_id = 1
for i in range(len(observations)):
    explanation_linear_dx = explainer_linear_dx.predict_surrogate(observations.iloc[i])
    explanation_linear_dx.plot()
    plt.savefig('Point2_dalex_linear_im' + str(plot_id) + '.png', dpi=300, bbox_inches='tight')
    plot_id += 1
X does not have valid feature names, but LinearRegression was fitted with feature names
In [35]:
# 6. Calculate Shapley values for selected observations with shap library (POINT 4)

explainer_shap = shap.TreeExplainer(forest_reg_load)
shap_values = explainer_shap.shap_values(observations)

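plot_id = 1
for i in range(len(observations)):
    plt.figure()  # start from a fresh figure so bars from a previous observation cannot carry over
    shap.bar_plot(shap_values[i], feature_names = observations.columns, show = False)
    plt.savefig('Point4_shap_im' + str(plot_id) + '.png', dpi=300, bbox_inches='tight')
    plt.close()  # release the figure once it has been saved
    plot_id += 1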
In [39]:
# 7. Analyze stability of the method

plot_id = 1
for i in range(3):
    for seed in range(5):
        np.random.seed(seed)
        explanation_forest_dx = explainer_forest_dx.predict_surrogate(observations.iloc[i])
        explanation_forest_dx.plot()
        plt.savefig('Point3_seed' + str(seed) + '_im' + str(plot_id) + '.png', dpi=300, bbox_inches='tight')
    plot_id += 1